Abstract

Koiran [7] showed that if an $n$-variate polynomial of degree $d$ (with $d = n^{O(1)}$) is computed by a circuit of size $s$, then it is also computed by a homogeneous circuit of depth four and of size $2^{O(\sqrt{d}\log(d)\log(s))}$. Using this result, Gupta, Kamath, Kayal and Saptharishi [4] gave an $\exp(O(\sqrt{d\log(d)\log(n)\log(s)}))$ upper bound for the size of the smallest depth three circuit computing an $n$-variate polynomial of degree $d = n^{O(1)}$ given by a circuit of size $s$.

We improve here Koiran's bound. Indeed, we show that if we reduce an arithmetic circuit to depth four, then the size becomes $\exp(O(\sqrt{d\log(ds)\log(n)}))$. Mimicking the proof in [4], it also implies the same upper bound for depth three circuits.

This new bound is not far from optimal in the sense that Gupta, Kamath, Kayal and Saptharishi [3] also showed a $2^{\Omega(\sqrt{d})}$ lower bound for the size of homogeneous depth four circuits such that gates at the bottom have fan-in at most $\sqrt{d}$. Finally, we show that this last lower bound also holds if the fan-in is at least $\sqrt{d}$.



1 Introduction 

Agrawal and Vinay proved [1] that if an $n$-variate polynomial $f$ of degree $d = O(n)$ has a circuit of size $2^{o(d + d\log(n/d))}$, then $f$ can also be computed by a depth-four circuit ($\Sigma\Pi\Sigma\Pi$) of size $2^{o(d + d\log(n/d))}$. This result shows that for proving arithmetic circuit lower bounds or black-box derandomization of identity testing, the case of depth-four arithmetic circuits is the general case in a certain sense. This result arose after other ones on parallelization. Valiant, Skyum, Berkowitz and Rackoff [9] proved that if a size-$s$ circuit computes a polynomial of degree $d$, then this polynomial can also be computed by a circuit of depth $O(\log(d)\log(s))$ and of size bounded by a polynomial in $s$. Some years later, Allender, Jiao, Mahajan and Vinay [2] showed that this parallelization could be done uniformly. Their method for parallelization is reused in [1] and will be the basis for the parallelization in this paper.

Agrawal and Vinay's result only deals with polynomials of sub-exponential complexity. But if the hypothesis is strengthened, it is possible to get a stronger conclusion. Indeed, Koiran [7] showed that if the circuit at the beginning is of size $s$, then the polynomial can be computed by a homogeneous depth-four circuit of size $2^{O(\sqrt{d}\log(d)\log(s))}$. For example, if the permanent family is computed by a polynomial size circuit (i.e., of size $n^c$), then it is computed by a depth-four circuit of size $2^{O(\sqrt{n}\log^2(n))}$. These results appear as an interesting approach to lower bounds: if one finds a $2^{\omega(\sqrt{n}\log^2 n)}$ lower bound on the size of depth-4 circuits computing the permanent, then it will imply that there are no polynomial size circuits for the permanent. The interest of this approach is confirmed by Gupta, Kamath, Kayal and Saptharishi's recent result [3]. They showed that if a homogeneous $\Sigma\Pi\Sigma\Pi$ circuit where the bottom fan-in is bounded by $t$ computes the permanent of a matrix of size $n \times n$, then its size is $2^{\Omega(n/t)}$. In a recent paper [4], the same authors improve the upper bound by transforming $n$-variate circuits of size $s$ and degree $d$ (with $d = n^{O(1)}$) into depth-3 circuits of size $\exp(O(\sqrt{d\log s\log n\log d}))$. Moreover, if the input is a branching program (and not a circuit), the upper bound becomes $\exp(O(\sqrt{d\log s\log n}))$. In particular, this result gives a depth-3 circuit of size $2^{O(\sqrt{n}\log n)}$ computing the determinant of an $n \times n$ matrix. Nevertheless, the depth-3 circuit they get is not homogeneous, and uses intermediate gates which compute polynomials of very high degree.

In this paper we improve Koiran's bound. We show that a circuit of size $s$ can be parallelized homogeneously in depth 4 and in size $\exp(O(\sqrt{d\log(ds)\log(n)}))$ such that the fan-in of each multiplication gate is bounded by $O(\sqrt{d\log(ds)/\log n})$. We can notice that as $n \le s$, the result implies Koiran's bound and is generally better (in the case where $d, s = n^{\Theta(1)}$, Koiran's bound is $2^{O(\sqrt{d}\log^2 n)}$ while the new bound is $2^{O(\sqrt{d}\log n)}$). It implies that a $2^{\omega(\sqrt{n}\log(n))}$ lower bound for depth-4 circuits computing the permanent gives a super-polynomial lower bound for general circuits computing the permanent. Moreover, using this result in Gupta, Kamath, Kayal and Saptharishi's proof instead of Koiran's result slightly improves the depth-3 upper bound: an $n$-variate circuit of size $s$ and degree $d$ is computed by a depth-3 circuit of size $\exp(O(\sqrt{d\log(ds)\log n}))$. So, we get the same bound for the reduction at depth 3 starting from an arithmetic circuit as from an arithmetic branching program. Finally, in Section 6 we show, by a counting argument, that if a homogeneous $\Sigma\Pi\Sigma\Pi$ circuit where the bottom fan-in is lower-bounded by $t$ computes the permanent (or the determinant) of a matrix of size $n \times n$, then its size is $2^{\Omega(t\log n)}$.



2 Arithmetic Circuits 

We give here a brief introduction to arithmetic circuit theory. The reader can find more detailed information in [10, 5, 8, 6]. In this theory, we measure the complexity of polynomial functions using arithmetic circuits.

Definition 1. An arithmetic circuit is a finite acyclic directed graph with vertices of in-degree 0 or more and exactly one vertex of out-degree 0. Vertices of in-degree 0 are called inputs and labeled by a constant or a variable. The other vertices are labeled by $\times$ or $+$ (or sometimes by $\odot$ in this paper) and called computation gates (the in-degree of these gates will also be called the fan-in). The vertex of out-degree 0 is called the output. The vertices of a circuit are commonly called gates and its edges arrows. Finally, we call a formula an arithmetic circuit whose underlying graph is a tree.

Each gate of a circuit computes a polynomial (defined by induction). The polynomial computed by a circuit is the polynomial computed by the output of this circuit. For a gate $\alpha$, we denote by $[\alpha]$ the polynomial computed by this gate. A circuit is called homogeneous if all its gates compute homogeneous polynomials. In fact, for some proofs, we will use circuits with several outputs (each one corresponds to a gate of out-degree 0). A $\odot$-gate corresponds to a multiplication-by-a-scalar gate. The fan-in of such a gate will always be 2 and at least one of its inputs corresponds to a constant (we will give a syntactic restriction just after the next definition).

Definition 2. The size of a circuit is its number of gates. The depth is the maximal length of a directed path from an input to an output. The degree of a gate is defined recursively: any variable input is of degree 1, constant inputs are of degree 0, the degree of a $+$-gate or a $\odot$-gate is the maximum of the incoming degrees and the degree of a $\times$-gate is the sum of the incoming degrees.

We can now state the restriction on the $\odot$-gates: for each one of these gates, one of its children has to be of degree 0.
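As an illustration of Definitions 1 and 2, the following minimal sketch computes the size, depth and degree of a small circuit. The `Gate` class, the operation names (`"var"`, `"const"`, `"+"`, `"*"`, and `"scal"` for a $\odot$-gate) and the example circuit are assumptions made for this illustration only; they are not notation from the paper.

```python
from dataclasses import dataclass


@dataclass(eq=False)  # identity-based hashing: a shared gate is counted once
class Gate:
    op: str                  # "var", "const", "+", "*" or "scal" (hypothetical names)
    children: tuple = ()
    label: object = None


def all_gates(root):
    """Collect every distinct gate reachable from the output gate."""
    seen, stack = set(), [root]
    while stack:
        gate = stack.pop()
        if gate not in seen:
            seen.add(gate)
            stack.extend(gate.children)
    return seen


def size(root):
    """Size = number of gates (inputs included)."""
    return len(all_gates(root))


def depth(gate):
    """Depth = maximal length of a directed path from an input to this gate."""
    if not gate.children:
        return 0
    return 1 + max(depth(child) for child in gate.children)


def degree(gate):
    """Degree as in Definition 2: sum at x-gates, maximum at + and scalar gates."""
    if gate.op == "var":
        return 1
    if gate.op == "const":
        return 0
    child_degrees = [degree(child) for child in gate.children]
    return sum(child_degrees) if gate.op == "*" else max(child_degrees)


# Example circuit computing 3 * (x + y)^2; the gate p = x + y is shared.
x = Gate("var", label="x")
y = Gate("var", label="y")
p = Gate("+", (x, y))
q = Gate("*", (p, p))
three = Gate("const", label=3)
out = Gate("scal", (three, q))

print(size(out), depth(out), degree(out))
```

Note that the scalar gate `out` satisfies the syntactic restriction above: its child `three` has degree 0.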

For a given circuit we will consider graphs called parse trees. A parse tree corresponds, in spirit, to the computation of one particular monomial.

Definition 3. The set of parse trees of a circuit $C$ is defined by induction on its size:

• If $C$ is of size 1 it has only one parse tree, itself.

• If the output gate of $C$ is a $+$-gate whose arguments are the gates $\alpha_1, \ldots, \alpha_k$, then the parse trees of $C$ are obtained by taking, for an arbitrary $i \le k$, a parse tree of the sub-circuit rooted in $\alpha_i$ and the arrow from $\alpha_i$ to the output.

• If the output gate of $C$ is a $\times$-gate or a $\odot$-gate whose arguments are the gates $\alpha_1, \ldots, \alpha_k$, the parse trees of $C$ are obtained by taking disjoint copies of parse trees of the sub-circuits rooted in $\alpha_i$ for all $i \le k$ and the arrows from all $\alpha_i$ to the output.



The polynomial computed by a circuit C becomes the sum of the monomials 
computed by the parse trees of C. 
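The correspondence between parse trees and monomials can be checked on a toy example. The sketch below is only an illustration under assumed conventions: a circuit is written as a nested tuple `(op, child, ...)` with `op` in `{"+", "*"}`, variables are strings and constants are numbers. These conventions and the function names are hypothetical.

```python
from collections import Counter
from itertools import product


def parse_tree_monomials(node):
    """Yield one (coefficient, variables) pair per parse tree of `node`."""
    if isinstance(node, str):            # variable input
        yield 1, (node,)
    elif isinstance(node, (int, float)):  # constant input
        yield node, ()
    else:
        op, *children = node
        if op == "+":
            # A parse tree of a +-gate picks exactly one argument.
            for child in children:
                yield from parse_tree_monomials(child)
        else:
            # A parse tree of a x-gate takes a parse tree of every argument.
            choices = [list(parse_tree_monomials(child)) for child in children]
            for combo in product(*choices):
                coeff, variables = 1, ()
                for c, v in combo:
                    coeff *= c
                    variables += v
                yield coeff, tuple(sorted(variables))


def polynomial(node):
    """Sum the monomials of all parse trees, dropping zero coefficients."""
    total = Counter()
    for coeff, variables in parse_tree_monomials(node):
        total[variables] += coeff
    return {m: c for m, c in total.items() if c != 0}


# (x + y) * (x + y) has four parse trees: xx, xy, yx and yy.
f = ("*", ("+", "x", "y"), ("+", "x", "y"))
print(len(list(parse_tree_monomials(f))), polynomial(f))
```

Two distinct parse trees may compute the same monomial (here `xy` and `yx`), which is why the coefficients have to be summed.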

We will use some convenient notations which are defined in [5]. Depth-4 circuits whose gates are multiplication gates at levels one and three and addition gates at levels two and four are denoted $\Sigma\Pi\Sigma\Pi$ circuits. Furthermore, a $\Sigma\Pi^{[\alpha]}\Sigma\Pi^{[\beta]}$ circuit is a $\Sigma\Pi\Sigma\Pi$ circuit such that the fan-in of the multiplication gates at level 3 is bounded by $\alpha$, and the fan-in of the multiplication gates at level 1 is bounded by $\beta$. For example, a $\Sigma\Pi^{[\alpha]}\Sigma\Pi^{[\beta]}$ circuit computes a polynomial of the form:

\[ \sum_{i=1}^{t} \prod_{j=1}^{a_i} \sum_{k=1}^{u_{i,j}} \prod_{l=1}^{b_{i,j,k}} x_{i,j,k,l} \]

where $a_i \le \alpha$ and $b_{i,j,k} \le \beta$.

3 Upper bounds 

Here, we state the main theorem in this paper. 

Theorem 4. Let $f$ be an $n$-variate polynomial computed by a circuit of size $s$ and of degree $d$. Then $f$ is computed by a $\Sigma\Pi\Sigma\Pi$ circuit $C$ of size $2^{O(\sqrt{d\log(ds)\log(n)})}$. Furthermore, if $f$ is homogeneous, it will also be the case for $C$.

The previous theorem can be directly applied for the permanent. 

Theorem 5. If the $n \times n$ permanent is computed by a circuit of size polynomial in $n$, then it is also computed by a $\Sigma\Pi\Sigma\Pi$ circuit of size $2^{O(\sqrt{n}\log(n))}$.

In their paper [4], Gupta, Kamath, Kayal and Saptharishi used the previous $2^{O(\sqrt{d}\log(d)\log(s))}$ bound [7] for parallelizing at depth 3. In fact, their proof is divided into three parts. First they transform circuits into depth-4 circuits, then they transform depth-4 circuits into depth-5 circuits using only sum and exponentiation gates. Finally, they transform these last circuits into depth-3 circuits. Using Theorem 4 instead of Theorem 4.1 in their paper improves the first part of their proof. That implies a small improvement of Theorem 1.1 in [3]:

Corollary 6. Let $f(x) \in \mathbb{Q}[x_1, \ldots, x_n]$ be an $n$-variate polynomial of degree $d = n^{O(1)}$ computed by an arithmetic circuit of size $s$. Then it can also be computed by a $\Sigma\Pi\Sigma$ circuit of size $2^{O(\sqrt{d\log n\log s})}$.

4 Useful propositions 

For proving Theorem 4 we will need the following propositions. The next result is folklore. A proof can be found in [2].



Proposition 7. If $f$ is a degree-$d$ polynomial computed by a $\{+, \times\}$-circuit $C$ of size $s$ such that the fan-in of each $+$-gate is unbounded and the fan-in of each $\times$-gate is bounded by 2, then there exists a circuit $C'$ of size $s(d+1)^2$ with $d+1$ outputs $O_0, O_1, \ldots, O_d$ such that:

• the fan-in of each $+$-gate is unbounded,

• the fan-in of each $\times$-gate is bounded by 2,

• for each $i$, the gate $O_i$ computes the homogeneous part of $f$ of degree $i$,

• $C'$ is homogeneous,

• the degree of each gate of $C'$ equals the degree of the polynomial computed by this gate.
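One common way to obtain homogeneous parts gate by gate is to keep, for every gate, one polynomial per degree $i \le d$: parts are added degree-wise at $+$-gates, and at a $\times$-gate the degree-$i$ part is $\sum_{j \le i} L_j R_{i-j}$. The sketch below illustrates this idea on explicit polynomials; it is an assumption that this matches the folklore construction, and the proof in [2] may differ in its details. The tuple encoding of circuits and all function names are hypothetical.

```python
from collections import Counter


def add(p, q):
    """Add two polynomials stored as {sorted tuple of variables: coefficient}."""
    out = Counter(p)
    for monomial, coeff in q.items():
        out[monomial] += coeff
    return {m: c for m, c in out.items() if c != 0}


def mul(p, q):
    """Multiply two polynomials stored as {sorted tuple of variables: coefficient}."""
    out = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def homogeneous_parts(node, d):
    """Return [h_0, ..., h_d], h_i being the degree-i part computed at `node`.

    `node` is a variable (str), a constant (number) or a binary gate
    ("+", left, right) / ("*", left, right). Parts of degree > d are dropped.
    """
    parts = [{} for _ in range(d + 1)]
    if isinstance(node, str):
        if d >= 1:
            parts[1] = {(node,): 1}
        return parts
    if isinstance(node, (int, float)):
        if node != 0:
            parts[0] = {(): node}
        return parts
    op, left, right = node
    lparts = homogeneous_parts(left, d)
    rparts = homogeneous_parts(right, d)
    if op == "+":
        return [add(lparts[i], rparts[i]) for i in range(d + 1)]
    for i in range(d + 1):          # x-gate: convolution over the degrees
        for j in range(i + 1):
            parts[i] = add(parts[i], mul(lparts[j], rparts[i - j]))
    return parts


# f = (x + 1) * (y + 2) = xy + 2x + y + 2
f = ("*", ("+", "x", 1), ("+", "y", 2))
parts = homogeneous_parts(f, 2)
print(parts)
```

In a circuit version of this sketch, each original gate would be replaced by $d+1$ gates, and each new $\times$-part by at most $d+1$ products and one sum, which is consistent with a size of order $s(d+1)^2$.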

We define $\times$-balanced $\{\times, +, \odot\}$-circuits.

Definition 8. A $\{\times, +, \odot\}$-circuit $C$ is called $\times$-balanced if and only if all the following properties are verified:

• the fan-in of each $\times$-gate is at most 5,

• the fan-in of each $+$-gate is unbounded,

• the fan-in of each $\odot$-gate is at most 2,

• for each $\times$-gate $\alpha$, each one of its arguments is of degree at most half of the degree of $\alpha$.

The last condition cannot hold for the multiplication by a scalar. This is the reason why we introduced the operator $\odot$.

The next proposition, which is implicitly a first parallelization result, is almost the same as the result that can be found in Section 2 of [1] or in Theorem 2.7 of [5]. We give a proof in the appendix.

Proposition 9. Let $f$ be a homogeneous degree-$d$ polynomial computed by a size-$s$ circuit $C$ defined as in the conclusion of Proposition 7. Then $f$ is computed by a homogeneous $\times$-balanced $\{\times, +, \odot\}$-circuit of size $s^6 + s^4 + 1$ and of degree $d$.

Agrawal and Vinay already noticed that Valiant, Skyum, Berkowitz and Rackoff's famous result [9] is a direct corollary of this proposition.

Corollary 10. Let $f$ be a polynomial of degree $d$ computed by a circuit of size $s$. Then $f$ is computed by a $\{+, \times\}$-circuit of size $(sd)^{O(1)}$ and of depth $O(\log(s)\log(d))$ where each $+$-gate and $\times$-gate is of fan-in 2.



5 Proof of Theorem 4

For realizing the reduction to depth four, Koiran begins by transforming the circuit into an equivalent arithmetic branching program. Then, he parallelizes the branching program, and finally comes back to circuits. The problem with this strategy is that the transformation from circuits to branching programs increases the size of the object: if the circuit is of size $s$, the new branching program is of size $s^{\log(d)}$. Here, the approach is to directly parallelize the circuit without using arithmetic branching programs in intermediate steps.

The idea is to split the circuit into two parts: gates of degree lower than $\sqrt{d}$ and gates of larger degree. Furthermore, a circuit such that the degree of each gate is bounded by $\sqrt{d}$ computes a polynomial of degree at most $\sqrt{d}$ and so can be written as a sum of at most $s^{O(\sqrt{d})}$ monomials. Then, if each part of our circuit computes polynomials of degrees bounded by $\sqrt{d}$, we just have to get the two depth-2 circuits and connect them together. The main difficulty comes from the fact that the sub-circuit formed by the gates of degree larger than $\sqrt{d}$ is not always of degree smaller than $\sqrt{d}$. For example, for the comb graph with $n - 1$ $\times$-gates and $n$ variable inputs:

\[ x_1 \cdot (x_2 \cdot (x_3 \cdot (\ldots))) \]

the degree of the first part is $\sqrt{n}$, but the degree of the second one is $n - \sqrt{n}$.

In fact, following ideas from [4], we are not going to cut exactly at level $\sqrt{d}$. It will give a sharper result.

Lemma 11. Let $f$ be a homogeneous $n$-variate polynomial of degree $d$ computed by a homogeneous $\times$-balanced $\{\times, +, \odot\}$-circuit $C$ of size $\sigma$. Then $f$ is computed by a homogeneous $\Sigma\Pi^{[15a]}\Sigma\Pi^{[d/a]}$ circuit of size

\[ 1 + \binom{\sigma + 15a}{15a} + \sigma + \sigma\binom{n + d/a}{d/a} + n \]

for any positive constant $a$ smaller than $d$.

To get nicer expressions, we will use the following consequence of Stirling's formula (a proof appears in [1]):

Lemma 12.

\[ \binom{k + l}{l} = 2^{O\left(l + l\log\frac{k}{l}\right)} \]

First, let us see how Lemma 11 implies Theorem 4.

Proof of Theorem 4. Let $f$ be an $n$-variate polynomial computed by a circuit of size $s$ and degree $d$. Let $C$ be the homogeneous circuit for the polynomial that we get by Proposition 7. The circuit $C$ is of size $t = s(d+1)^2$ and computes all polynomials $f_0, \ldots, f_d$ where $f_i$ is the homogeneous part of $f$ of degree $i$. Then for each $i \le d$, by Proposition 9 there exists a homogeneous $\times$-balanced circuit $C'$ of size $\sigma = t^6 + t^4 + 1$ computing $f_i$. We apply Lemma 11 for the circuit $C'$ with $a = \sqrt{\frac{d\log n}{\log\sigma}}$. Using Lemma 12 we get a homogeneous $\Sigma\Pi\Sigma\Pi$ circuit of size

\[ 1 + \binom{\sigma + 15a}{15a} + \sigma + \sigma\binom{n + d/a}{d/a} + n = 2^{O(\sqrt{d\log\sigma\log n})}. \]

At the end, we just have to add together the homogeneous parts $f_i$. As $\sigma = O(s^6 d^{12})$, it gives a $2^{O(\sqrt{d\log(ds)\log n})}$ upper bound for the size. □

Remark 13. Choosing the easier assignment $a = \sqrt{d}$ gives a $2^{O(\sqrt{d}\log(ds))}$ upper bound.
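To get a feeling for the two choices of $a$, the size expression of Lemma 11 can be evaluated numerically. The following sketch is only illustrative: rounding $15a$ and $d/a$ up to integers and the sample values of $n$, $d$ and $s$ are assumptions made here, and the function name is hypothetical.

```python
import math


def lemma11_log2_size(sigma, n, d, a):
    """log2 of 1 + C(sigma+15a, 15a) + sigma + sigma*C(n+d/a, d/a) + n."""
    top = math.ceil(15 * a)      # assumed rounding of the fan-in bound 15a
    bottom = math.ceil(d / a)    # assumed rounding of the degree bound d/a
    size = (1 + math.comb(sigma + top, top) + sigma
            + sigma * math.comb(n + bottom, bottom) + n)
    return math.log2(size)


# Sample parameters (assumptions): d = n and s = n^3.
n, d, s = 100, 100, 100 ** 3
t = s * (d + 1) ** 2                 # size after Proposition 7
sigma = t ** 6 + t ** 4 + 1          # size after Proposition 9

a_balanced = math.sqrt(d * math.log(n) / math.log(sigma))
a_easy = math.sqrt(d)

balanced = lemma11_log2_size(sigma, n, d, a_balanced)
easy = lemma11_log2_size(sigma, n, d, a_easy)
print(round(balanced), round(easy))
```

For these sample values the balanced choice of $a$ gives the smaller exponent, in line with the comparison between $\sqrt{d\log(ds)\log n}$ and $\sqrt{d}\log(ds)$.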

Proving Lemma 11 will complete the proof.

Proof of Lemma 11. We define circuits $C_1$ and $C_2$ as follows. $C_1$ is the circuit we get by keeping only the gates of $C$ of degree $< \frac{d}{a}$. Circuit $C_2$ is made up of the remaining gates (i.e., those of degree $\ge \frac{d}{a}$) and of the inputs of these gates. These inputs are the only gates which belong both to $C_1$ and to $C_2$.

Each gate $\alpha$ of $C_1$ has degree at most $\frac{d}{a}$, so computes a polynomial of degree at most $\frac{d}{a}$. By homogeneity of $C$, the polynomial computed in $\alpha$ is homogeneous. Consequently, $[\alpha]$ is a homogeneous sum of at most $\binom{n + d/a}{d/a}$ monomials, and so can be computed by a homogeneous depth-2 circuit of size $1 + \binom{n + d/a}{d/a} + n$ (the "1" encodes the $+$-gate, the "$n$" encodes the input gates, and the remainder encodes the $\times$-gates).

We are going to show now that the degree of $C_2$ is bounded by $15a$.

Let $\delta$ be the degree of $C_2$. There exists a degree-$\delta$ monomial $m$ in $C_2$. Let $T$ be a parse tree computing $m$.

We partition the set of $\times$-gates of $T$ into 3 sets:

• $\mathcal{G}_0 = \{\alpha \in T \mid \alpha$ is a $\times$-gate and all children of $\alpha$ are leaves of $T\}$

• $\mathcal{G}_1 = \{\alpha \in T \mid \alpha$ is a $\times$-gate and exactly one child of $\alpha$ is not a leaf$\}$

• $\mathcal{G}_2 = \{\alpha \in T \mid \alpha$ is a $\times$-gate and at least two children of $\alpha$ are not leaves$\}$.

Then, if we consider the sub-tree $S$ of $T$ with only the gates of $C_2$, the gates of $\mathcal{G}_0$ are leaves of $S$, the gates of $\mathcal{G}_1$ are internal vertices of fan-in 1 and the gates of $\mathcal{G}_2$ are internal vertices of fan-in at least 2.

The proof is in two parts. First we upper-bound the size of the sets $\mathcal{G}_0$, $\mathcal{G}_1$ and $\mathcal{G}_2$. Then, we upper-bound the degree of $m$.

In $C$, the degree of $m$ is at least the sum of the degrees of the gates of $\mathcal{G}_0$ (since two of these gates cannot appear on the same path). Each one of these gates is in $C_2$, so is of degree at least $\frac{d}{a}$ in $C$. As $m$ is of degree at most $d$ in $C$, it means that the number of gates in $\mathcal{G}_0$ is at most $a$.

In $C$, the degree of $m$ is at least the sum of the degrees of the leaves directly connected to a gate of $\mathcal{G}_1$. For each gate $\alpha$ of $\mathcal{G}_1$, exactly one of its inputs $\beta$ is in $C_2$, hence of degree at least $\frac{d}{a}$ in $C$. Since $C$ is $\times$-balanced (Proposition 9), the degree of $\alpha$ is at least two times the degree of $\beta$. It follows that the sum of the degrees of the inputs of $\alpha$ which are in $C_1$ is also at least $\frac{d}{a}$. Then, the number of vertices in $\mathcal{G}_1$ is at most $a$.

Finally, in a tree, the number of leaves is larger than the number of vertices of fan-in at least 2. Then in $S$, we get that:

\[ |\mathcal{G}_2| \le |\mathcal{G}_0| \le a. \]

In $C_2$, the degree of the monomial $m$ is the number of leaves of $T$ labelled by a non-constant input. We match each leaf with the first $\times$-gate which is connected to it. As in $T$ the fan-in of the $\times$-gates is bounded by 5, the fan-in of the $+$-gates is bounded by 1 and each $\odot$-gate adds only one constant input, the number of variable leaves connected to a particular $\times$-gate is at most 5. So the number of leaves in $T$ is at most:

\[ 5 \times (|\mathcal{G}_0| + |\mathcal{G}_1| + |\mathcal{G}_2|) \le 15a. \]

This proves that the degree of $C_2$ is at most $15a$. Then, the number of inputs of $C_2$ is bounded by the number of gates in $C_1$ and so in $C$ (which is $\sigma$). So, there exists a depth-2 circuit which computes $C_2$, of size $1 + \binom{\sigma + 15a}{15a} + \sigma$, with the gates of $C_1$ as inputs.

Consequently, each polynomial $f_i$ can be computed by a homogeneous $\Sigma\Pi^{[15a]}\Sigma\Pi^{[d/a]}$ circuit of size at most $1 + \binom{\sigma + 15a}{15a} + \sigma + \sigma\binom{n + d/a}{d/a} + n$. □

6 A lower bound 

In [3], it was proved that if a homogeneous depth-four circuit computing $\mathrm{Perm}_n$ has its bottom fan-in bounded by $t$, then the size of the circuit is at least $2^{\Omega(n/t)}$. But what happens if bottom multiplication gates all have a large fan-in? We show that this implies a similar lower bound for the size of the circuit:

Theorem 14. If $C$ is a homogeneous $\Sigma\Pi\Sigma\Pi$ circuit which computes $\mathrm{Perm}_n$ (or $\mathrm{Det}_n$) such that the fan-in of each bottom multiplication gate is at least $t$, then the size of $C$ is at least $2^{\Omega(t\log(n))}$.

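Our approach is only based on counting the number of monomials. We begin with some definitions.

Definition 15. For a multivariate polynomial $f(\mathbf{x}) = \sum_{i} a_i \mathbf{x}_i$, we will denote by $\mathcal{M}_f$ the set $\{\mathbf{x}_i \mid \mathbf{x}_i$ is a monomial of $f\}$. If $E$ is a set of polynomials, we also define $\mathcal{M}_E = \bigcup_{f \in E} \mathcal{M}_f$.

We can notice that $\mathcal{M}_{\mathrm{Perm}_n} = \{ x_{1,\sigma(1)} \cdots x_{n,\sigma(n)} \mid \sigma \in \mathfrak{S}_n \}$. So, $|\mathcal{M}_{\mathrm{Perm}_n}| = n!$.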
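As a quick sanity check of this count, the monomials of the permanent can be enumerated for a small $n$. In the sketch below a monomial is encoded as a tuple of index pairs $(i, \sigma(i))$; this encoding and the function name are assumptions made for the illustration.

```python
from itertools import permutations
from math import factorial


def permanent_monomials(n):
    """Return the set of monomials x_{1,s(1)} ... x_{n,s(n)} of Perm_n.

    Each monomial is encoded as a tuple of (row, column) index pairs.
    """
    return {
        tuple((i, sigma[i]) for i in range(n))
        for sigma in permutations(range(n))
    }


monomials = permanent_monomials(4)
print(len(monomials), factorial(4))
```

Distinct permutations give distinct monomials, so the set has exactly $n!$ elements; the same holds for the determinant, whose monomials differ only by their signs.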

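Definition 16. Let $E$ be a set of polynomials. Let us denote

\[ E^{+} = \{ f_1 + \ldots + f_m \mid m \in \mathbb{N} \text{ and } \forall i \le m,\ f_i \in E \} \]

and

\[ E^{\times k} = \{ f_1 \times \ldots \times f_m \mid m \le k \text{ and } \forall i \le m,\ f_i \in E \}. \]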



Lemma 17. Let $E$ be a set of polynomials. Then,

• $|\mathcal{M}_{E^{+}}| \le |\mathcal{M}_E|$,

• $|\mathcal{M}_{E^{\times s}}| \le (|\mathcal{M}_E| + 1)^s$.

Proof. If $x$ is a monomial in $\mathcal{M}_{E^{+}}$, it means there exist polynomials $f_1, \ldots, f_m$ in $E$ such that $x$ is a monomial of $f_1 + \ldots + f_m$. Then there exists $i \le m$ such that $x$ is a monomial of $f_i$ and so $x$ is an element of $\mathcal{M}_E$. Hence $\mathcal{M}_{E^{+}} \subseteq \mathcal{M}_E$.

Moreover, if $x$ is a monomial in $\mathcal{M}_{E^{\times s}}$, it means there exist polynomials $f_1, \ldots, f_m$ in $E$ such that $x$ is a monomial of $f_1 \times \ldots \times f_m$ with $m \le s$. It implies that $x \in \{ x_1 \times \ldots \times x_m \mid m \le s$ and $x_i \in \mathcal{M}_E \}$. That is to say, $x \in \{ x_1 \times \ldots \times x_s \mid x_i \in (\mathcal{M}_E \cup \{1\}) \}$. It proves the lemma. □
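The second inequality of Lemma 17 can be checked on a small example. The sketch below is an illustration under assumed conventions: a polynomial is a dictionary mapping a sorted tuple of variables to a coefficient, and the set `E` is an arbitrary example chosen here. For simplicity only products of $m \ge 1$ factors are enumerated.

```python
from collections import Counter
from itertools import combinations_with_replacement


def mul(p, q):
    """Multiply two polynomials stored as {sorted tuple of variables: coefficient}."""
    out = Counter()
    for m1, c1 in p.items():
        for m2, c2 in q.items():
            out[tuple(sorted(m1 + m2))] += c1 * c2
    return {m: c for m, c in out.items() if c != 0}


def support(polys):
    """M_E: the monomials appearing in at least one polynomial of the collection."""
    return {monomial for p in polys for monomial in p}


def products_up_to(E, s):
    """All products f_1 x ... x f_m with 1 <= m <= s and every f_i in E."""
    result = []
    for m in range(1, s + 1):
        for combo in combinations_with_replacement(E, m):
            prod = {(): 1}
            for f in combo:
                prod = mul(prod, f)
            result.append(prod)
    return result


# Example set E = {x + y, x - y, z}.
E = [{("x",): 1, ("y",): 1}, {("x",): 1, ("y",): -1}, {("z",): 1}]
M_E = support(E)
M_prod = support(products_up_to(E, 2))
print(len(M_E), len(M_prod), (len(M_E) + 1) ** 2)
```

Cancellations, such as the term $xy$ in $(x+y)(x-y)$, can only make the support smaller, so they do not affect the upper bound.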

Let $C$ be a $\Sigma\Pi\Sigma\Pi$ circuit. The gates of the circuit are layered into five levels. Inputs are at level 0, multiplication gates at levels 1 and 3 and addition gates at levels 2 and 4. For each level $i$, let us denote by $s_i$ the number of gates at this level, by $t_i$ an upper bound on the fan-in of these gates and by $E_i$ the set of polynomials computed at this level.

Lemma 18. Any $\Sigma\Pi\Sigma\Pi$ circuit that computes $\mathrm{Perm}_n$ (or $\mathrm{Det}_n$) such that the fan-in of the multiplication gates at level 3 is bounded by $v$ must have size $\exp\left[\Omega\left(\frac{n}{v}\log(n)\right)\right]$.

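Proof. We notice that the hypothesis in the lemma about the bound of the fan-in just states that $t_3 \le v$.

The polynomials in $E_1$ are just monomials. So, $|\mathcal{M}_{E_1}| \le s_1$. Furthermore, we have:

\[ E_4 \subseteq E_3^{+}, \qquad E_3 \subseteq E_2^{\times t_3}, \qquad E_2 \subseteq E_1^{+}. \]

Then by Lemma 17,

\[ |\mathcal{M}_{E_4}| \le (s_1 + 1)^{t_3} \le (s_1 + 1)^{v}. \]

However, as $\mathrm{Perm}_n$ is an element of $E_4$, we also have:

\[ |\mathcal{M}_{E_4}| \ge |\mathcal{M}_{\mathrm{Perm}_n}| = n!. \]

So, $s_1 \ge (n!)^{1/v} - 1 = 2^{\Omega\left(\frac{n}{v}\log(n)\right)}$. □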
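The last estimate can be made concrete numerically. The sketch below computes $\log_2 (n!)^{1/v}$ with `math.lgamma` and compares it with $\frac{n}{v}\log_2 n$; the sample values of $n$ and $v$ and the function name are assumptions chosen for the illustration.

```python
import math


def log2_root_factorial(n, v):
    """log2 of (n!)^(1/v), i.e. log2 of the lower bound on s_1 + 1."""
    # math.lgamma(n + 1) is the natural logarithm of n!.
    return math.lgamma(n + 1) / (v * math.log(2))


n, v = 100, 10
bound = log2_root_factorial(n, v)
reference = (n / v) * math.log2(n)
print(round(bound, 2), round(reference, 2))
```

Since $(n/e)^n \le n! \le n^n$, the two quantities differ by at most a constant factor once $n$ is large enough, which is what the $\Omega$ notation records.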

The result of this lemma directly implies Theorem 14.

Proof of Theorem 14. Let $C$ be a homogeneous $\Sigma\Pi\Sigma\Pi$ circuit which computes $\mathrm{Perm}_n$ (or $\mathrm{Det}_n$) such that the fan-in of each bottom gate is at least $t$. It implies that the degree of each gate at levels 1 and 2 is at least $t$. As the circuit is homogeneous, the degree of a gate at level 3 is upper-bounded by $n$ and lower-bounded by $t$ times the number of inputs of this gate. Consequently, in $C$, the fan-in of the multiplication gates at level 3 is bounded by $\frac{n}{t}$. Then Lemma 18 implies the theorem. □



In fact, for computing the determinant, we can also notice that the fan-in of multiplication gates in the depth-four circuits that we get either in [7] or here in Section 5 is linear in $\sqrt{n}$. It implies that in this case, the bounds are tight.

Corollary 19. If $C$ is a $\Sigma\Pi\Sigma\Pi$ circuit which computes $\mathrm{Det}_n$ such that the fan-in of each bottom multiplication gate is $\Omega(\sqrt{n})$ or such that the fan-in of each multiplication gate of level 3 is $O(\sqrt{n})$, then the minimal size of $C$ is $2^{\Theta(\sqrt{n}\log(n))}$.

Proof. Koiran's result [7] implies that there exist depth-four circuits for $\mathrm{Det}_n$ of size $2^{O(\sqrt{n}\log n)}$ such that all multiplication gates have fan-in bounded by $O(\sqrt{n})$. For the lower bound, the case where the bottom fan-in is lower-bounded by $\Omega(\sqrt{n})$ is given by Theorem 14. The case where the fan-in of gates of level 3 is bounded by $O(\sqrt{n})$ is given by Lemma 18. □

Consequently, it would be an interesting question to know the lower bound on the size of a homogeneous circuit computing $\mathrm{Det}_n$. In [3] the authors show that if the circuit is such that the fan-in of bottom gates is bounded by $O(\sqrt{n})$, then the size is $2^{\Omega(\sqrt{n})}$. Here, we show that if all bottom fan-ins are lower-bounded by $\Omega(\sqrt{n})$, then the size is $2^{\Omega(\sqrt{n}\log n)}$. What happens if in the circuit, there are some bottom gates with a large fan-in and some bottom gates with a small fan-in?

Open question 1. Is it true that if $C$ is a homogeneous depth-four circuit which computes $\mathrm{Det}_n$ then the size of $C$ is at least $2^{\Omega(\sqrt{n})}$?
Appendix: Proof of Proposition 9

Let $f$ be a homogeneous polynomial computed by a circuit $C$ of size $s$ as in the proposition. First, we can delete the "calculus with constants". To do that, we just have to replace recursively each gate all of whose entries are constants by the constant value of this gate. Then, by homogeneity, constants cannot be entries of a $+$-gate. Then, for each $\times$-gate such that one entry is a constant, we replace the $\times$-gate by a scalar $\odot$-gate. We can notice that this transformation does not increase the size of the circuit. Second, we can reorder the children of the $\times$-gates and of the $\odot$-gates so that, for each one of these gates, the degree of the rightmost child is larger than or equal to the degree of the other child. We get a circuit $C_1$ of size $s$.

We define now a new circuit $C_2$ which satisfies the criteria of the proposition. For each pair of gates $\alpha$ and $\beta$ in $C_1$, we define the gate $(\alpha; \beta)$ in $C_2$ as follows:

• If $\beta$ is a leaf, then $[(\alpha; \beta)]$ equals the sum of the parse trees rooted in $\alpha$ such that $\beta$ appears in the rightmost path (i.e., $\beta$ is the leaf of the rightmost path).

• If $\beta$ is not a leaf, then $[(\alpha; \beta)]$ equals the sum of the parse trees rooted in $\alpha$ such that $\beta$ appears in the rightmost path and where the subcircuit rooted in $\beta$ is deleted. That is as if we replace the gate $\beta$ by the input 1 in the rightmost path and we compute $[(\alpha; \beta)]$ with $\beta = 1$ a leaf.

We can notice here that it is easy to get the polynomial computed by the gate $\alpha$:

\[ [\alpha] = \sum_{l \text{ leaf}} [(\alpha; l)]. \]
Now, we show how one can compute the value of the gates $(\alpha; \beta)$.

• If $\beta$ does not appear on the rightmost path of a parse tree rooted in $\alpha$, then $(\alpha; \beta) = 0$.

• If $\alpha$ is a leaf, then $(\alpha; \alpha) = \alpha$ and else $(\alpha; \alpha) = 1$.

• Otherwise $\alpha$ and $\beta$ are two different gates and so $\alpha$ is not a leaf. If $\alpha$ is a $+$-gate, then $[(\alpha; \beta)]$ is simply the sum of all $[(\alpha'; \beta)]$, where $\alpha'$ is a child of $\alpha$.

• If $\alpha$ is a $\odot$-gate, then one child is a constant $c$ and the other child is a gate $\alpha'$. Then $(\alpha; \beta)$ is simply the scalar operation $[(\alpha; \beta)] = [(c; c)] \odot [(\alpha'; \beta)]$.

• If $\alpha$ is a $\times$-gate, there are two cases.

- First case: $\beta$ is a leaf. Then $\deg(\alpha) > \deg(\beta) = 1$. On each rightmost path ending on $\beta$ of a parse tree rooted in $\alpha$, there exists exactly one $\times$-gate $\gamma$, with right child $\gamma_r$ on this path and other child $\gamma_l$, such that:

\[ \deg(\gamma) > \tfrac{1}{2}\deg(\alpha) \ge \deg(\gamma_r). \tag{1} \]

Conversely, we notice that for each gate $\gamma$ satisfying (1), if $[(\alpha; \gamma)]$ and $[(\gamma_r; \beta)]$ are not zero, then $\gamma$ is on a rightmost path from $\alpha$ to $\beta$. Then,

\[ [(\alpha; \beta)] = \sum_{\substack{l \text{ leaf},\\ \gamma\ \times\text{-gate verifying (1)}}} [(\alpha; \gamma)]\,[(\gamma_l; l)]\,[(\gamma_r; \beta)]. \]

One can notice that $\deg(\alpha; \beta) = \deg(\alpha)$. Using (1):

\[ \deg(\alpha; \gamma) = \deg(\alpha) - \deg(\gamma) < \tfrac{1}{2}\deg(\alpha) \]
\[ \deg(\gamma_r; \beta) = \deg(\gamma_r) \le \tfrac{1}{2}\deg(\alpha) \]
\[ \deg(\gamma_l; l) = \deg(\gamma_l) \le \deg(\gamma_r) \le \tfrac{1}{2}\deg(\alpha). \]

Consequently, $[(\alpha; \beta)]$ is computed by a depth-2 circuit of size at most $s^2 + 1$: a $+$-gate where each child is a $\times$-gate of fan-in 3. Furthermore, each child of these $\times$-gates is of degree at most half of the degree of the $\times$-gate.

- Second case: $\beta$ is not a leaf. Then there exists on every rightmost path rooted in $\alpha$ a $\times$-gate $\gamma$, with child $\gamma_r$ on this path and other child $\gamma_l$, such that:

\[ \deg(\gamma) > \tfrac{1}{2}(\deg(\alpha) + \deg(\beta)) \ge \deg(\gamma_r). \tag{2} \]

Then by the same argument,

\[ [(\alpha; \beta)] = \sum_{\substack{l \text{ leaf},\\ \gamma\ \times\text{-gate verifying (2)}}} [(\alpha; \gamma)]\,[(\gamma_l; l)]\,[(\gamma_r; \beta)]. \tag{3} \]

We have this time with (2):

\[ \deg(\alpha; \beta) = \deg(\alpha) - \deg(\beta) \]
\[ \deg(\alpha; \gamma) = \deg(\alpha) - \deg(\gamma) < \tfrac{1}{2}(\deg(\alpha) - \deg(\beta)) \]
\[ \deg(\gamma_r; \beta) = \deg(\gamma_r) - \deg(\beta) \le \tfrac{1}{2}(\deg(\alpha) - \deg(\beta)). \]

The problem here is that the degree of $(\gamma_l; l)$ could be larger than half of the degree of $(\alpha; \beta)$, that is, larger than $\tfrac{1}{2}(\deg(\alpha) - \deg(\beta))$. If $\gamma_l$ is of degree at most 1 (and so exactly 1) and if the degree of $(\alpha; \beta)$ is also 1, then $\gamma = \alpha$ (they are the same gate) and $(\gamma_r; \beta)$ is of degree 0 and computes a constant $c_\gamma$. Hence,

\[ [(\alpha; \beta)] = \sum_{\substack{l \text{ leaf},\\ \gamma\ \times\text{-gate verifying (2)}}} c_\gamma \odot [(\gamma_l; l)]. \]

Now, if the degree of $\gamma_l$ is again 1 but $(\alpha; \beta)$ is of degree at least 2, then the computation of the gate $(\alpha; \beta)$ by formula (3) works (i.e., the degree of $(\gamma_l; l)$ is at most half of the degree of $(\alpha; \beta)$). Otherwise, the degree of $\gamma_l$ is at least 2 and at most $\deg(\alpha; \beta)$. As $l$ is a leaf, we can apply the first case (even if $\gamma_l$ is not a $\times$-gate). There exists also on every rightmost path rooted in $\gamma_l$ a $\times$-gate $\mu$, with child $\mu_r$ on this path and other child $\mu_l$, such that:

\[ \deg(\mu) > \tfrac{1}{2}\deg(\gamma_l) \ge \deg(\mu_r). \tag{4} \]

Then,

\[ [(\alpha; \beta)] = \sum_{\substack{l_1, l_2 \text{ leaves},\\ \gamma\ \times\text{-gate verifying (2)},\\ \mu\ \times\text{-gate verifying (4)}}} [(\alpha; \gamma)]\,[(\gamma_r; \beta)]\,[(\gamma_l; \mu)]\,[(\mu_l; l_2)]\,[(\mu_r; l_1)]. \tag{5} \]

The degrees of the gates $(\gamma_l; \mu)$, $(\mu_l; l_2)$ and $(\mu_r; l_1)$ are bounded by half of the degree of $\gamma_l$. Hence, $[(\alpha; \beta)]$ is computed by a depth-2 circuit of size $s^4 + 1$. The $\times$-gates are of fan-in bounded by 5 and the degree of their children is bounded by half their degree.

Consequently, for each pair of gates $\alpha$ and $\beta$ in $C_1$, the gate $(\alpha; \beta)$ is computed in $C_2$ by a sub-circuit of size at most $s^4 + 1$. At the end we get a circuit of size at most $s^6 + s^2$ which computes all gates $(\alpha; \beta)$. Finally, $f$ is computed by a circuit of size bounded by $s^6 + s^2 + 1$.

That proves the proposition. 